Production of artificial speech



May 27, 1941. H. W. DUDLEY 2,243,525

PRODUCTION OF ARTIFICIAL SPEECH

Filed March 16, 1940. 4 Sheets.

[Drawing sheets 1 to 4, Figs. 1 to 4. Legible labels: gas filled, equalizer, random, controls, resonance, buffer amplifiers. Inventor: H. W. Dudley.]

Patented May 27, 1941

UNITED STATES PATENT OFFICE

2,243,525

PRODUCTION OF ARTIFICIAL SPEECH

Homer W. Dudley, Garden City, N. Y., assignor to Bell Telephone Laboratories, Incorporated, New York, N. Y., a corporation of New York

Application March 16, 1940, Serial No. 324,286

9 Claims. (Cl. 179-1)

The present invention relates to the artificial production of speech or similar complex waves. The invention has particular reference to, and will be disclosed in connection with, a key-operated or finger-operated mechanism for building up understandable speech waves from waves produced in suitable wave sources and constituting the raw material out of which the speech is to be constructed.

An object of the invention is the artificial production of speech or similar waves in simplified and effective manner.

In my prior Patent 2,121,142, June 21, 1938, I have disclosed a system for artificial speech production from waves of two types, a buzzer-like tone wave and a hiss-like noise wave, which are combined in particular manners under control of keys to build up speech waves. In that system as disclosed a number of narrow band filters, for example, ten filters, are used in the synthesizing operation, these filters having fixed frequency limits and being brought into the circuit in such combinations as needed. Usually only a small number of these filters are needed for any one sound, such as two or three filters, the rest of the group being momentarily unused.

The present invention aims at a simplification in apparatus, especially in the matter of the frequency selective devices used. The present invention achieves such simplification by imitating more closely the process carried out by the human speech mechanism. While the system of my prior patent has operated successfully as an artificial speech producer, the use of a large number of fixed filters which are switched into and out of circuit in groups has no close counterpart in the human speech mechanism. Rather, in human speech production the frequency controls are exercised by the air chamber resonances of the mouth comprising principally two variable resonances, and changes are made mostly by gliding from one sound to another as a result of continuous or gradual, instead of abrupt, changes.

In talking, the size of the total mouth cavity is varied by movement of the lower jaw and the lips while the tongue modifies the mouth resonances in a very important manner by shaping the mouth cavities. For example, the tongue by rising in the middle divides the mouth cavity into two principal chambers with air coupling between them, and the resonances of these two chambers are varied between wide limits by varying the respective air chamber sizes and degree of coupling. The tongue may be drawn back and the tip raised so as to provide two resonance [...] principal portions. Moreover, as indicated above, the positions of jaws, tongue and lips used for producing one sound effect change by a continuous movement to the positions necessary to produce the next sound effect, resulting in a continuous transition from one sound to another, in a large number of cases.

According to the present invention the conditions of nature are more closely approached by using, in place of the large number of fixed filters disclosed in my prior patent, a small number of resonant circuits, such as two, which are variable and which can be variably controlled to produce sounds that merge, one into the next, as the transition is made from one sound to another.

The nature of the invention and its various objects and features will be made more clear from the following detailed description of particular embodiments as illustrated in the attached drawings.

In the drawings:

Fig. 1 is a schematic circuit diagram of a system for artificially producing speech sounds manually under the control of keys in accordance with one form of the invention;

Figs. 1A and 1B show modifications respectively that may be substituted for the corresponding portion of the circuit of Fig. 1, as indicated by the broken lines;

Fig. 2 is a schematic circuit diagram similar to Fig. 1 but with certain simplifications according to the invention;

Fig. 3 is a diagram generally similar to Fig. 1 but showing a modified system according to the invention; and

Fig. 4 is a fragmentary circuit schematic showing a modification that may be substituted in the system of Fig. 1, as indicated by the broken lines.

Referring to Fig. 1, a system for producing artificial speech is shown in which five keys or controls are used, numbered I, II, III, IV and V in the figure. Key I is the pitch control key, key II is the key for selecting between the type of energy source (buzz or hiss), keys III and IV are the resonance controls and key V is the stop consonant key.

The source of energy simulating the vocal cords and producing the buzz or complex tone wave is shown as a relaxation oscillator comprising gas-filled tube 1 and associated circuits. This may be of the type shown in United States patent to R. R. Riesz 2,183,248, issued December 12, 1939. It produces a wave rich in harmonics, the fundamental of which is readily variable by controlling the negative grid bias. Equalizer 4 reduces all of the harmonics to the same amplitude or is designed to produce some other desired relation between them. When key I is in its upper position it makes contact at 2, giving a bias to the grid equal to the sum of the voltages of batteries 3' and 3", which is great enough to prevent production of oscillations. As key I is depressed, contact is made at 2'. The two parts of the resistance 3 and the resistance adjacent 3" act like a potentiometer so that when contact is made at 2' a lesser part of the voltage of battery 3" is effective on the grid and the relaxation oscillator starts to oscillate at a low fundamental frequency. As the key is depressed further, a proportionately smaller voltage is applied from source 3" to the grid and the fundamental frequency of the oscillator is caused to increase.
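The action of key I described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the battery voltages, the taper of the potentiometer tap and the linear relation between bias and fundamental frequency are all hypothetical, since the text gives no figures for them. It also assumes that only a fraction of battery 3" acts on the grid once contact 2' is made.

```python
from typing import Optional

# Hypothetical values; the patent text gives no voltages or frequencies here.
V_3A = 22.5    # battery 3' (volts)
V_3B = 22.5    # battery 3" (volts)
F_LOW = 80.0   # assumed fundamental just after contact 2' is made (cycles per second)
F_HIGH = 300.0 # assumed fundamental with key I fully depressed

TAP_HIGH = 0.9  # assumed fraction of battery 3" on the grid at first contact
TAP_LOW = 0.3   # assumed fraction with key I fully depressed


def grid_bias(travel: float) -> float:
    """Magnitude of the negative grid bias for key travel in [0, 1]."""
    if not 0.0 <= travel <= 1.0:
        raise ValueError("travel must lie in [0, 1]")
    if travel == 0.0:
        # Upper position, contact 2: both batteries in series, tube cut off.
        return V_3A + V_3B
    # Contact 2': the potentiometer applies a shrinking fraction of battery 3".
    fraction = TAP_HIGH - (TAP_HIGH - TAP_LOW) * travel
    return fraction * V_3B


def fundamental_hz(travel: float) -> Optional[float]:
    """Assumed linear map from bias to fundamental; None while the tube is cut off."""
    if travel == 0.0:
        return None
    bias = grid_bias(travel)
    span = (TAP_HIGH - TAP_LOW) * V_3B
    return F_LOW + (F_HIGH - F_LOW) * (TAP_HIGH * V_3B - bias) / span
```

With these assumed values, `fundamental_hz(0.0)` is `None` (no oscillation), and the returned frequency rises as the bias falls with further depression of the key.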

The source of continuous spectrum or random frequency noise is shown as comprising gas tube 5, followed by amplifier 6, and equalizer 7 for making the energy distribution constant over the utilized frequency band or otherwise suitably shaping it.

Energy source key II in its upper position connects oscillator 1 to the conductor 8 and in its lower position it connects the random noise source to conductor 8. The other side of the respective source is shown grounded. For simplicity of wiring, Fig. 1 shows use of a ground return although the grounds could be replaced by a conductor, if desired.
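A digital analogue of the two energy sources and of the selection made by key II may help fix ideas. The sketch below is not the disclosed circuit. It assumes a sample rate of 8,000 per second, models the equalized relaxation oscillator as a sum of equal-amplitude harmonics, and models the equalized gas-tube noise as uniformly distributed random samples. All function names are hypothetical.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second; an assumed value for this sketch


def buzz(f0, n_samples, max_harmonics=30):
    """Equal-amplitude harmonics of f0, standing in for tube 1 plus equalizer 4."""
    harmonics = [k for k in range(1, max_harmonics + 1) if k * f0 < SAMPLE_RATE / 2]
    if not harmonics:
        raise ValueError("f0 is too high for the chosen sample rate")
    out = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        total = sum(math.sin(2 * math.pi * k * f0 * t) for k in harmonics)
        out.append(total / len(harmonics))
    return out


def hiss(n_samples, seed=0):
    """Random samples with a flat average spectrum, standing in for tube 5 and its equalizer."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def select_source(key_ii_down, f0, n_samples):
    """Key II up selects the buzz; key II down selects the hiss."""
    return hiss(n_samples) if key_ii_down else buzz(f0, n_samples)
```

For example, `select_source(False, 100.0, 160)` returns 160 samples of a 100-cycle buzz, while `select_source(True, 100.0, 160)` returns noise and ignores the pitch argument, just as key I has no effect on the random noise source.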

The function of keys III and IV is to control the tuning of the two selective circuit branches shown included in conductor 8. The resonant circuit controlled by key III includes inductances 9 and 10, resistance 11 and capacity 12. The resonant circuit controlled by key IV includes resistance 13, capacity 14 and inductances 15 and 16. While the resonances of these circuits might be controlled in various ways the particular method exemplified in this figure is by varying the saturating current flowing in the windings, which have saturable cores. As either key [...] may be. This current varies the point of operation on the permeability curve.

Key V when in its top position transmits the waves at full volume to the variable volume control pad 17 and amplifier 18 leading to loudspeaker 19 or other output such as a line. When [...] be determined by trial. Stop consonant effects are introduced by sharp, quick movements of key V.

In considering the operation of the circuit of Fig. 1, it will be convenient to have certain frequency relations in mind. It will be assumed for illustration that the random noise covers the human speech band and that for production of normal speech effects the relaxation oscillator circuit can have its fundamental frequency varied [...] resonance, and resistances 11 and 13 may be proportioned to give the desired degree of flatness, even to the possibility of having them vary with the inductances under control of extra arms on keys III and IV, not shown. To give illustrative figures, when circuit 9, 10, 11, 12 is adjusted to resonance at 1,000 cycles, it may give 6 decibels greater attenuation at 250 cycles away from resonance than it has at the resonant frequency; and when circuit 13, 14, 15, 16 is adjusted to resonance at 3,000 cycles it may give 8 decibels greater attenuation at 450 cycles away from resonance than at the resonant frequency. These values are not critical and in practice can be varied widely to produce desired effects.

With these general values of frequencies in mind, it is seen that manipulation of key I produces rising and falling inflection and determines the normal talking fundamental, while keys III and IV mold the output of either of the two energy sources, as chosen by key II, into recognizable sounds similarly to the way in which the human mouth molds the vocal cord waves and [...]
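The illustrative attenuation figures (6 decibels at 250 cycles from a 1,000-cycle resonance, 8 decibels at 450 cycles from a 3,000-cycle resonance) can be turned into a rough measure of sharpness. The sketch below assumes each branch behaves as a simple single-tuned resonator with quality factor Q. That is a modelling assumption, not something stated in the text. It also takes the offset on the high-frequency side of resonance.

```python
import math


def relative_attenuation_db(f_hz, f0_hz, q):
    """Attenuation at f relative to resonance f0 for a single-tuned circuit (assumed model)."""
    x = f_hz / f0_hz - f0_hz / f_hz
    return 10.0 * math.log10(1.0 + (q * x) ** 2)


def q_from_attenuation(f0_hz, offset_hz, atten_db):
    """Q that gives atten_db of extra loss at f0 + offset under the same model."""
    f_hz = f0_hz + offset_hz
    x = f_hz / f0_hz - f0_hz / f_hz
    return math.sqrt(10.0 ** (atten_db / 10.0) - 1.0) / abs(x)


q_low = q_from_attenuation(1000.0, 250.0, 6.0)   # branch 9, 10, 11, 12
q_high = q_from_attenuation(3000.0, 450.0, 8.0)  # branch 13, 14, 15, 16
```

Under this model the lower resonance works out to a Q of roughly 3.8 and the upper to roughly 8.2, both fairly broad, which is consistent with the statement that the values are not critical.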

While in Fig. 1, as well as in the figures to follow, the use of onlytwo resonance controls is shown it is within the invention to use morethan two resonance controls, and additional ones may be provided bysubstantial duplication of the ones shown. The variation of inductancecan be made to follow any desired law within considerable limits by useof cores in the different inductances saturating at different rates orin different directions as disclosed more fully in my copendingapplication Serial No. 324,288 filed March 16, 1940.
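How far the inductance must be varied to move a resonance across the speech band follows from the usual relation for a tuned circuit, f0 = 1/(2π√(LC)). The sketch below lumps the inductance of each branch into a single value L and assumes a capacity of 0.1 microfarad. Both are illustrative assumptions and not figures from the text.

```python
import math

C_ASSUMED = 0.1e-6  # farads; hypothetical value for capacity 12 or 14


def resonant_frequency(l_henries, c_farads):
    """Resonant frequency in cycles per second of a lumped L-C circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))


def inductance_for(f0_hz, c_farads):
    """Inductance in henries that tunes the circuit to f0 with the given capacity."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * c_farads)


l_at_1000 = inductance_for(1000.0, C_ASSUMED)
l_at_2000 = inductance_for(2000.0, C_ASSUMED)
```

With the assumed capacity, about 0.25 henry tunes the branch to 1,000 cycles. Because the resonant frequency varies as the inverse square root of the inductance, doubling the resonance requires the saturating current to reduce the inductance to one quarter.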

In Figs. 1A and 1B different types of resonance controls are shown which may be substituted for the type shown in Fig. 1. Key III in Fig. 1A moves a sliding contact along inductive winding 20, while key IV similarly moves its slider along inductive winding 21. The slider may move from [...] units. These prevent storage of energy with resultant clicks as the contacts are made or broken. The resistances also give some damping of the circuit resonance.

Fig. 2 illustrates a system in which a simplification in the number of controls has been made. In this figure the sound wave sources 1 and 5 and their associated circuit elements may be the same as previously described. Key II of Fig. 1 has been replaced by a relay 24 adapted to be energized from battery 25 whenever key I' bridges over contacts 26, which is its first operative position, the topmost position being idle, corresponding to such a large bias on the grid that no oscillations are produced. When relay 24 is energized its armature closes its lower contact and connects the random noise source to conductor 8. Further downward movement of key I releases relay 24 to connect the tone source 1 to conductor 8 and such further downward movement of key I also results in controlling the fundamental pitch as previously described.

It is possible to omit the explosive consonant control also. A sudden oncoming of energy of the right frequency distribution will produce a good imitation of the explosive sound desired. This is accomplished in the circuit of Fig. 2 by providing a contact 27 in the path of the slider of key I. When contact is made at 27 by the rising key I a silent condition results which can be terminated by lowering the key again, resulting in energy changes such as might occur in the sound g of the word ago. Similarly, the stop position may be preceded or followed by the random noise source connection above it. By the proper sequence of key I positions and of adjustment of resonance by key III any of the stop consonants can be produced.

In general, the resonance controls are not independent. When a sound has a certain resonance in the lower frequency range, it will have an associated resonance in the upper frequency range. This is more or less fundamental in the way the air cavity of the mouth is divided into two parts for producing resonant effects. In any case, a certain lower resonance and a certain upper resonance must go together for a particular sound. Because of these facts there is associated with each resonance A a single corresponding resonance B for the optimum production of a particular sound. Because of this, it is possible to combine these two resonant controls into one multiple resonance control.

This has been done in Fig. 2 by providing a single key III with two sliders operating over resistances 28 and 29, respectively, for controlling the direct current in the inductive windings of the respective resonant branches, similarly to Fig. 1. By strapping variously the contacts to points along resistor 29, any type of variation such as increasing or decreasing at various rates can be secured by continuous movement of the slider uniformly in the same direction. The remainder of the system from this point on to the final output may be the same as in Fig. 1. Instead of the particular type of resonance controls shown, other types could be used such as those shown in Figs. 1A and 1B with suitable mechanical linkage between the two sliders and the single control key.

Fig. 3 shows the same system as Fig. 1 except as modified to include a more elaborate stop consonant [...] outputs from the energy sources are shown, each including a "buffer" amplifier 31, 32, 33 or 34 to isolate the two separate output branches of the same oscillator and prevent undesired reactions.

The stop consonant key V is similar in construction to that shown in Fig. 38 of my copending application for United States patent Serial No. 181,275, filed December 23, 1937, and is arranged to provide for the making of a series of circuit closures in given sequence and returned to normal position without operating the contacts in reverse order, if such action is desired. The key shank 35 is linked to the slider 36 which in its downward travel engages in succession contacts 37, 38, 39, 40 and 41. If the key is not allowed to rise until it has been depressed the full limit, its end 36 passes under the lower end of guide trigger 42 which is urged to the left by spring 43, so that on its upward stroke the end 36 passes on the right-hand side of member 42 and makes contact with fixed member 44 on such upward stroke. The manner in which it is used in the figure is illustrative of one type of control. In its rest position as shown, wiper 36 connects conductor 8 and contact 37 directly to the outgoing circuit including volume adjusting network 17, amplifier 18 and loud-speaker 19. When moved into engagement with contact 38, wiper 36 connects the output of the relaxation oscillator 1 through buffer amplifier 33 and network 45 to the outgoing circuit. Thus network 45 is substituted for the variable resonance control circuits and by constructing network 45 to have definite characteristics the desired resonance effect is produced when this is brought into the circuit. When moved to engage contact 39, wiper 36 brings network 46 into circuit between the output of the resonance control circuits and the outgoing circuit. Network 46 may be constructed to give its individual characteristic shaping to the waves. When in contact with member 40, wiper 36 connects the output of the random noise source 5 through buffer amplifier 34 and shaping network 47 to the outgoing circuit. This may be used to give a pronounced sibilant or hiss or aspirant or other unvoiced effect and the resonance of network 47 may be chosen to select the particular part of the spectrum to give the desired effect. If depressed only part way the key V retraces its downward path on returning, but if depressed to the full extent it returns wiper 36 on the opposite side of catch 42 so that 36 makes contact with spring 44 which, as shown, has the same electrical effect as contact between 36 and 37. By variously wiring the contacts of key V, various effects are obtainable, those described being illustrative of some. Moreover, if desired, a plurality of separate keys like key V may be used in parallel and so wired that each key corresponds to a particular consonant or group of consonants.
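The order in which wiper 36 of the stop consonant key meets its contacts can be written out as a small routine. This is a sketch only. It assumes that the full limit of travel coincides with contact 41 and that the key finally comes to rest on contact 37. The function name and the routing table are hypothetical labels for what the text describes.

```python
DOWN_CONTACTS = [37, 38, 39, 40, 41]  # met in this order on the downward stroke
RETURN_CONTACT = 44                   # met on the upward stroke after a full depression

# Signal routing per contact, paraphrased from the description of Fig. 3.
ROUTING = {
    37: "resonance control output direct to outgoing circuit",
    38: "oscillator 1 via buffer amplifier 33 and network 45",
    39: "resonance control output via network 46",
    40: "noise source 5 via buffer amplifier 34 and network 47",
    41: "not described in the text",
    44: "same electrical effect as contact 37",
}


def contact_sequence(deepest):
    """Contacts touched by wiper 36 when the key is pressed to `deepest` and released."""
    if deepest not in DOWN_CONTACTS:
        raise ValueError("deepest must be one of the downward-stroke contacts")
    down = DOWN_CONTACTS[: DOWN_CONTACTS.index(deepest) + 1]
    if deepest == DOWN_CONTACTS[-1]:
        # Full depression: the wiper passes the trigger 42 and returns over 44 only.
        return down + [RETURN_CONTACT]
    # Partial depression: the wiper retraces its downward path in reverse.
    return down + list(reversed(down[:-1]))
```

For instance, `contact_sequence(39)` gives `[37, 38, 39, 38, 37]`, while `contact_sequence(41)` gives `[37, 38, 39, 40, 41, 44]`, so that a full stroke produces the series of closures once, without repeating them in reverse order.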

In Fig. 4, the circuit of which is to be substituted in Fig. 1 between lines 2-2 and 11-11, separate resonance control circuits are shown for voiced and unvoiced sounds with common control keys III" and IV". In this arrangement the control keys [...]. Each resonance control key carries two sliders which operate over separate inductance controls in respective circuits. The resonant branches 9, 10, 11, 12 and 13, 14, 15, 16 may be the same as in Fig. 1 and controlled in exactly similar manner, as shown. In the present figure, however, these are used only with respect to the [...] controlled by resonant branches 9', 10', 11', 12' and 13', 14', 15', 16', which may be constructed to have resonant properties similar to or, preferably, different from the resonant circuits for the voiced energy. Since the unvoiced sounds usually employ higher frequencies than the voiced sounds, the resonant branches indicated by primed numerals would be resonant to higher frequencies than the others. Such an arrangement makes for ease in manipulation, for a key can be made to correspond, in one position, to one frequency condition in the branch carrying waves representing voiced sounds and to some different frequency condition in the consonant sound branches, so that both of these resonant conditions can be obtained without movement of the control key in question. Thus the transition from voiced to unvoiced sounds and vice versa can be made with less extent of movement of the control keys in many instances.

What is claimed is:

1. In a system for producing artificial speech, a source of electrical wave energy of frequency spectrum resembling vocal cord sounds, a source of electrical wave energy of continuous spectrum for simulating unvoiced sounds, means to select wave energy from either source, means to control variably the waves from the selected source comprising variable resonant circuits simulating the effect of the different resonant cavities of the human mouth.

2. In a system for producing artificial speech, electrical wave generating means for producing electrical wave energy having a spectrum distribution resembling that of vocal cord sounds and other electrical wave energy of continuous spectrum, a switch for selecting either of said types of wave energy at will, a shaping circuit comprising a pair of resonant circuits capable of having their resonances separately varied over the major part of the speech frequency range, individual resonance controls for said circuits, means to apply the wave energy selected by said switch to said resonant circuits, and a sound producer actuated by the energy transmitted through said resonant circuits.

3. In mechanism for producing artificial speech and similar effects, a controlling circuit, means producing waves consisting of a fundamental and a plurality of harmonic components in the speech frequency range, means producing other waves of random frequency distribution within the speech frequency range, selecting means for [...] artificial speech, [...] movement of the human speech mechanism, and means to translate the energy in said controlling circuit into sounds simulating speech.

4. A system for the artificial production of speech waves comprising means for generating waves covering the speech frequency range and comprising a fundamental frequency and harmonics thereof, means for generating waves of random frequency distribution, means for producing sound from said waves in sequences simulating the sound sequences in speech and a resonant system connected to said generating and sound producing means having variable resonance means for emphasizing sounds in different portions of the speech frequency band.

5. In a system for producing artificial speech, a source of electrical waves comprising a fundamental frequency and harmonics thereof, a source of waves of random frequency distribution, means for translating waves from either source into sound, a resonant system for transferring waves from said sources to said translating means comprising circuit branches of adjustable resonance, and means for varying the resonant frequencies of said branches.

6. The combination according to claim 5, including manually operated keys for selecting the one of said wave sources to be used and for varying the resonance of said circuit branches.

7. The method of producing speech artificially comprising generating electrical waves simulating the vocal cord vibrations and generating other electrical waves simulating unvoiced sounds, reproducing sounds from said waves and molding the waves which produce said sounds by producing resonance effects upon the electrical waves simulating the action of the resonant air chambers of the mouth and producing continuous variations of said resonance effects to make transitions from one sound to another.

8. In a system for the artificial production of speech, a relaxation oscillator, a random noise source, means for translating electrical waves into sound waves, variable resonance circuits adapted to connect said sources to said translating means and manually operated means for varying the resonant frequencies of said resonance circuits.

9. In a system for the artificial production of speech, a relaxation oscillator, a random noise source, means for translating electrical waves into sound waves, variable resonance circuits adapted to connect said sources to said translating means, manually operated means for varying the resonant frequencies of said resonance circuits and a stop consonant key having a movable contacting element, movable over a succession of stationary contacts for connecting the input of said sound reproducing means in rapid succession to a succession of shaping networks.

HOMER W. DUDLEY.

